{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pinecone-io/examples/blob/master/learn/generation/langchain/handbook/11-langchain-expression-language.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/pinecone-io/examples/blob/master/learn/generation/langchain/handbook/11-langchain-expression-language.ipynb)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### [LangChain Handbook](https://pinecone.io/learn/langchain)\n",
    "\n",
    "# LangChain Expression Language (LCEL)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The **L**ang**C**hain **E**xpression **L**anguage (LCEL) is a abstraction of some interesting Python concepts into a format that enables a \"minimalist\" code layer for building chains of LangChain components.\n",
    "\n",
    "LCEL comes with strong support for:\n",
    "\n",
    "1. Superfast development of chains.\n",
    "2. Advanced features such as streaming, async, parallel execution, and more.\n",
    "3. Easy integration with LangSmith and LangServe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install -qU \\\n",
    "    langchain==0.0.345 \\\n",
    "    anthropic==0.7.7 \\\n",
    "    cohere==4.37 \\\n",
    "    docarray==0.39.1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## LCEL Syntax"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To understand LCEL syntax let's first build a simple chain in typical Python syntax."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.chat_models import ChatAnthropic\n",
    "from langchain.prompts import ChatPromptTemplate\n",
    "from langchain.schema.output_parser import StrOutputParser\n",
    "\n",
    "ANTHROPIC_API_KEY = \"<<YOUR_ANTHROPIC_API_KEY>>\"\n",
    "\n",
    "if ANTHROPIC_API_KEY == \"<<YOUR_ANTHROPIC_API_KEY>>\":\n",
    "    raise ValueError(\"Please set your ANTHROPIC_API_KEY\")\n",
    "\n",
    "prompt = ChatPromptTemplate.from_template(\n",
    "    \"Give me small report about {topic}\"\n",
    ")\n",
    "model = ChatAnthropic(\n",
    "    model=\"claude-2.1\",\n",
    "    max_tokens_to_sample=512,\n",
    "    anthropic_api_key=ANTHROPIC_API_KEY\n",
    ")  # swap Anthropic for OpenAI with `ChatOpenAI` and `openai_api_key`\n",
    "output_parser = StrOutputParser()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In typical LangChain we would chain these together using an `LLMChain`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " Here is a brief report on some key aspects of artificial intelligence (AI):\n",
      "\n",
      "Introduction\n",
      "- AI refers to computer systems that are designed to perform tasks that would otherwise require human intelligence, such as visual perception, speech recognition, decision-making, and language translation. \n",
      "\n",
      "Major AI Techniques\n",
      "- Machine learning uses statistical techniques and neural networks to enable systems to improve at tasks with experience. Common techniques include deep learning, reinforcement learning, and supervised learning.\n",
      "- Computer vision focuses on extracting information from digital images and videos. It powers facial recognition, self-driving vehicles, and other visual AI tasks.\n",
      "- Natural language processing enables computers to understand, interpret, and generate human languages. Key applications include machine translation, search engines, and voice assistants like Siri.\n",
      "\n",
      "Current Capabilities\n",
      "- AI programs have matched or exceeded human capabilities in narrow or well-defined tasks like playing chess and Go, identifying objects in images, and transcribing speech. \n",
      "- However, general intelligence comparable to humans across different areas remains an unsolved challenge, often referred to as artificial general intelligence (AGI).\n",
      "\n",
      "Future Directions\n",
      "- Ongoing AI research is focused on developing stronger machine learning techniques, achievingexplainability and transparency in AI decision-making, and addressing potential ethical issues like bias.\n",
      "- If achieved, AGI could have significant societal and economic impacts, potentially enhancing and automating intellectual work. However safety, control and alignment with human values remain active research priorities.\n",
      "\n",
      "I hope this brief overview of some major aspects of the current state of AI technology and research provides useful context and information. Let me know if you would like me to elaborate on or clarify anything further.\n"
     ]
    }
   ],
   "source": [
    "from langchain.chains import LLMChain\n",
    "\n",
    "chain = LLMChain(\n",
    "    prompt=prompt,\n",
    "    llm=model,\n",
    "    output_parser=output_parser\n",
    ")\n",
    "\n",
    "# and run\n",
    "out = chain.run(topic=\"Artificial Intelligence\")\n",
    "print(out)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using LCEL the format is different, rather than relying on `Chains` we simple chain together each component using the pipe operator `|`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " Here is a brief report on artificial intelligence:\n",
      "\n",
      "Artificial intelligence (AI) refers to computer systems that can perform human-like cognitive functions such as learning, reasoning, and self-correction. AI has advanced significantly in recent years due to increases in computing power and the availability of large datasets and open source machine learning libraries.\n",
      "\n",
      "Some key highlights about the current state of AI:\n",
      "\n",
      "- Applications of AI - AI is being utilized in a wide variety of industries including finance, healthcare, transportation, criminal justice, and social media platforms. Use cases include personalized recommendations, predictive analytics, automated customer service agents, medical diagnosis, self-driving vehicles, and content moderation.\n",
      "\n",
      "- Machine Learning - The ability for AI systems to learn from data without explicit programming is driving much of the recent progress. Machine learning methods like deep learning neural networks have achieved new breakthroughs in areas like computer vision, speech recognition, and natural language processing. \n",
      "\n",
      "- Limitations - While AI has made great strides, current systems still have major limitations compared to human intelligence including lack of general world knowledge, difficulties dealing with novelty, bias issues from flawed datasets, and lack of skills for complex reasoning, empathy, creativity, etc. Ensuring the safety and controllability of AI systems remains an ongoing challenge.  \n",
      "\n",
      "- Future Outlook - Experts predict key areas for AI advancement to include gaining contextual understanding and reasoning skills, achieving more human-like communication abilities, algorithmic fairness and transparency, as well as advances in specialized fields like robotics, autonomous vehicles, and human-AI collaboration. Careful management of risks posed by more advanced AI systems remains crucial. Global competition for AI talent and computing resources continues to intensify.\n",
      "\n",
      "That covers some of the key trends, strengths and limitations, and future trajectories for artificial intelligence technology based on the current landscape. Please let me know if you would like me to elaborate on any part of this overview.\n"
     ]
    }
   ],
   "source": [
    "lcel_chain = prompt | model | output_parser\n",
    "\n",
    "# and run\n",
    "out = lcel_chain.invoke({\"topic\": \"Artificial Intelligence\"})\n",
    "print(out)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Pretty cool, the way that this pipe operator is being used is that it is taking output from the function to the _left_ and feeding it into the function on the _right_."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## How the Pipe Operator Works"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To really understand LCEL we can take a look at how this pipe operation works. We know it takes output from the _right_ and feeds it to the _left_ — but this isn't typical Python, so how is this implemented? Let's try creating our own version of this with some simple functions.\n",
    "\n",
    "We will be using the `__or__` method within Python class objects. When we place two classes together like `chain = class_a | class_b` the Python interpreter will check if these classes contain this `__or__` method. If they do then our `|` code will be translated into `chain = class_a.__or__(class_b)`.\n",
    "\n",
    "That means both of these patterns will return the same thing:\n",
    "\n",
    "```python\n",
    "# object approach\n",
    "chain = class_a.__or__(class_b)\n",
    "chain(\"some input\")\n",
    "\n",
    "# pipe approach\n",
    "chain = class_a | class_b\n",
    "chain(\"some input\")\n",
    "\n",
    "```\n",
    "\n",
    "With that in mind, we can build a `Runnable` class that consumes a function and turns it into a function that can be chained with other functions using the pipe operator `|`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Runnable:\n",
    "    def __init__(self, func):\n",
    "        self.func = func\n",
    "\n",
    "    def __or__(self, other):\n",
    "        def chained_func(*args, **kwargs):\n",
    "            # the other func consumes the result of this func\n",
    "            return other(self.func(*args, **kwargs))\n",
    "        return Runnable(chained_func)\n",
    "\n",
    "    def __call__(self, *args, **kwargs):\n",
    "        return self.func(*args, **kwargs)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's implement this to take the value `3`, add `5` (giving `8`), and multiply by `2` (giving `16`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "16"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def add_five(x):\n",
    "    return x + 5\n",
    "\n",
    "def multiply_by_two(x):\n",
    "    return x * 2\n",
    "\n",
    "# wrap the functions with Runnable\n",
    "add_five = Runnable(add_five)\n",
    "multiply_by_two = Runnable(multiply_by_two)\n",
    "\n",
    "# run them using the object approach\n",
    "chain = add_five.__or__(multiply_by_two)\n",
    "chain(3)  # should return 16"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using `__or__` directly we get the correct answer, now let's try using the pipe operator `|` to chain them together:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "16"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# chain the runnable functions together\n",
    "chain = add_five | multiply_by_two\n",
    "\n",
    "# invoke the chain\n",
    "chain(3)  # we should return 16"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using either method we get the same response, at it's core this is what LCEL is doing, but there is more."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## LCEL Deep Dive"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that we understand what this syntax is doing under the hood, let's explore it within the context of LCEL and see a few of the additional methods that LangChain has provided to maximize flexibility when working with LCEL."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Runnables"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When working with LCEL we may find that we need to modify the structure or values being passed between components — for this we can use _runnables_. Let's try:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.embeddings import CohereEmbeddings\n",
    "from langchain.vectorstores import DocArrayInMemorySearch\n",
    "\n",
    "COHERE_API_KEY = \"<<COHERE_API_KEY>>\"\n",
    "if COHERE_API_KEY == \"<<COHERE_API_KEY>>\":\n",
    "    raise ValueError(\"Please set your COHERE_API_KEY\")\n",
    "\n",
    "embedding = CohereEmbeddings(\n",
    "    model=\"embed-english-light-v3.0\",\n",
    "    cohere_api_key=COHERE_API_KEY\n",
    ")\n",
    "\n",
    "vecstore_a = DocArrayInMemorySearch.from_texts(\n",
    "    [\"half the info will be here\", \"James' birthday is the 7th December\"],\n",
    "    embedding=embedding\n",
    ")\n",
    "vecstore_b = DocArrayInMemorySearch.from_texts(\n",
    "    [\"and half here\", \"James was born in 1994\"],\n",
    "    embedding=embedding\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First let's try passing a question through to a single one of these `vecstore` objects:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_core.runnables import (\n",
    "    RunnableParallel,\n",
    "    RunnablePassthrough\n",
    ")\n",
    "\n",
    "retriever_a = vecstore_a.as_retriever()\n",
    "retriever_b = vecstore_b.as_retriever()\n",
    "\n",
    "prompt_str = \"\"\"Answer the question below using the context:\n",
    "\n",
    "Context: {context}\n",
    "\n",
    "Question: {question}\n",
    "\n",
    "Answer: \"\"\"\n",
    "prompt = ChatPromptTemplate.from_template(prompt_str)\n",
    "\n",
    "retrieval = RunnableParallel(\n",
    "    {\"context\": retriever_a, \"question\": RunnablePassthrough()}\n",
    ")\n",
    "\n",
    "chain = retrieval | prompt | model | output_parser"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " Unfortunately I do not have enough context to definitively state when James was born. The only potentially relevant information is \"James' birthday is the 7th December\", but this does not specify the year he was born. To answer the question of when specifically James was born, I would need more details or context such as his current age or the year associated with his birthday.\n"
     ]
    }
   ],
   "source": [
    "out = chain.invoke(\"when was James born?\")\n",
    "print(out)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we have used `RunnableParallel` to create two parallel streams of information, one for `\"context\"` that is information fed in from `retriever_a`, and another for `\"question\"` which is the _passthrough_ information, ie the information that is passed through from our `chain.invoke(\"when was James born?\")` call.\n",
    "\n",
    "Using this information the chain is close to answering the question but it doesn't have enough information, it is missing the information that we have stored in `retriever_b`. Fortunately, we can have multiple parallel information streams using the `RunnableParallel` object."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "prompt_str = \"\"\"Answer the question below using the context:\n",
    "\n",
    "Context:\n",
    "{context_a}\n",
    "{context_b}\n",
    "\n",
    "Question: {question}\n",
    "\n",
    "Answer: \"\"\"\n",
    "prompt = ChatPromptTemplate.from_template(prompt_str)\n",
    "\n",
    "retrieval = RunnableParallel(\n",
    "    {\n",
    "        \"context_a\": retriever_a, \"context_b\": retriever_b,\n",
    "        \"question\": RunnablePassthrough()\n",
    "    }\n",
    ")\n",
    "\n",
    "chain = retrieval | prompt | model | output_parser"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " Based on the context provided, James was born in 1994. This is stated in the second document with the page content \"James was born in 1994\". Therefore, the answer to the question \"when was James born?\" is 1994.\n"
     ]
    }
   ],
   "source": [
    "out = chain.invoke(\"when was James born?\")\n",
    "print(out)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now the `RunnableParallel` object is allowing us to search with `retriever_a` _and_ `retriever_b` in parallel, ie at the same time."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Runnable Lambdas"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `RunnableLambda` is a LangChain abstraction that allows us to turn Python functions into pipe-compatible function, similar to the `Runnable` class we created near the beginning of this notebook.\n",
    "\n",
    "Let's try it out with our earlier `add_five` and `multiply_by_two` functions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_core.runnables import RunnableLambda\n",
    "\n",
    "# wrap the functions with RunnableLambda\n",
    "add_five = RunnableLambda(add_five)\n",
    "multiply_by_two = RunnableLambda(multiply_by_two)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As with our earlier `Runnable` abstraction, we can use `|` operators to chain together our `RunnableLambda` abstractions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [],
   "source": [
    "chain = add_five | multiply_by_two"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Unlike our `Runnable` abstraction, we cannot run the `RunnableLambda` chain by calling it directly, instead we must call `chain.invoke`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "16"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chain.invoke(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As before, we can see the same answer. Naturally we can feed custom functions into our chains using this approach. Let's try a short chain and see where we might want to insert a custom function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [],
   "source": [
    "prompt_str = \"Tell me an short fact about {topic}\"\n",
    "prompt = ChatPromptTemplate.from_template(prompt_str)\n",
    "\n",
    "chain = prompt | model | output_parser"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\" Here's a short fact about artificial intelligence:\\n\\nAI systems can analyze huge amounts of data and detect patterns that humans may miss. For example, AI is helping doctors diagnose diseases earlier by processing medical images and spotting subtle signs that a human might not notice.\""
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chain.invoke({\"topic\": \"Artificial Intelligence\"})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\" Here's a short fact about artificial intelligence:\\n\\nAI systems are able to teach themselves over time. Through machine learning, algorithms can analyze large amounts of data and improve their own processes and decision making without needing to be manually updated by humans. This self-learning ability is a key attribute enabling AI progress.\""
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chain.invoke({\"topic\": \"Artificial Intelligence\"})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The returned text always includes the initial `\"Here's a short fact about ...\\n\\n\"` — let's add a function to split on double newlines `\"\\n\\n\"` and only return the fact itself."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [],
   "source": [
    "def extract_fact(x):\n",
    "    if \"\\n\\n\" in x:\n",
    "        return \"\\n\".join(x.split(\"\\n\\n\")[1:])\n",
    "    else:\n",
    "        return x\n",
    "    \n",
    "get_fact = RunnableLambda(extract_fact)\n",
    "\n",
    "chain = prompt | model | output_parser | get_fact"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Most AI systems today are narrow AI, meaning they are focused on and trained for a specific task like computer vision, natural language processing or playing chess. General artificial intelligence that has human-level broad capabilities across many domains does not yet exist. AI has made tremendous progress in recent years thanks to advances in deep learning, big data and computing power, but still has limitations and scientists are working to make AI systems safer and more beneficial to humanity.'"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chain.invoke({\"topic\": \"Artificial Intelligence\"})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'AI systems can analyze massive amounts of data and detect patterns that humans may miss. This ability to find insights in large datasets is one of the key strengths of AI and enables many practical applications like personalized recommendations, fraud detection, and medical diagnosis.'"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chain.invoke({\"topic\": \"Artificial Intelligence\"})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we're getting well formatted outputs using our final custom `get_fact` function.\n",
    "\n",
    "---\n",
    "\n",
    "That covers the essentials you need to getting started and building with the **L**ang**C**hain **E**xpression **L**anguage (LCEL). With it, we can put together chains very easily — and the current focus of the LangChain team is on further LCEL development and support.\n",
    "\n",
    "The pros and cons of LCEL are varied. Those that love it tend to focus on the minimalist code style, LCEL's support for streaming, parallel operations, and async, and LCEL's nice integration with LangChain's focus on chaining components together.\n",
    "\n",
    "There are also people that are less fond of LCEL. Typically people will point to it being yet another abstraction on top of an already very abstract library, that the syntax is confusing, against [the Zen of Python](https://peps.python.org/pep-0020/), and requires too much effort to learn new (or uncommon) syntax.\n",
    "\n",
    "Both viewpoints are entirely valid, LCEL is a very different approach — ofcourse there are major pros and major cons. In either case, if you're willing to spend some time learning the syntax, it allows us to develop very quickly, and with that in mind it's well worth learning."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "ml",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
